Diagnostic Evaluation of Machine Translation Systems Using Automatically Constructed Linguistic Check-Points

Authors

  • Ming Zhou
  • Bo Wang
  • Shujie Liu
  • Mu Li
  • Dongdong Zhang
  • Tiejun Zhao
Abstract

We present a diagnostic evaluation platform that provides multi-factored evaluation based on automatically constructed check-points. A check-point is a linguistically motivated unit (e.g., an ambiguous word, a noun phrase, a verb-object collocation, or a prepositional phrase) that is pre-defined in a linguistic taxonomy. We present a method that automatically extracts check-points from parallel sentences. By means of check-points, our method can monitor how an MT system translates important linguistic phenomena and thereby provide diagnostic evaluation. The effectiveness of our approach for diagnostic evaluation is verified through experiments on various types of MT systems.
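To make the idea of per-category diagnostic scores concrete, here is a minimal sketch in Python. The abstract does not specify how a check-point is scored, so the n-gram matching rule below is an assumption, and the function names, category labels, and example sentences are all hypothetical illustrations rather than the authors' implementation. Check-points are assumed to have been extracted already, each carrying a category and the reference translation of the unit.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def checkpoint_credit(reference_tokens, hypothesis_tokens):
    """Count reference n-grams (all orders) that also occur in the hypothesis.

    Assumption: a check-point earns credit in proportion to how many
    n-grams of its reference translation appear in the system output.
    """
    matched = total = 0
    for n in range(1, len(reference_tokens) + 1):
        hyp = set(ngrams(hypothesis_tokens, n))
        ref = ngrams(reference_tokens, n)
        total += len(ref)
        matched += sum(1 for gram in ref if gram in hyp)
    return matched, total


def score_by_category(checkpoints, hypotheses):
    """Aggregate matched/total n-gram counts per linguistic category.

    checkpoints: list of (sentence_id, category, reference_tokens)
    hypotheses:  dict mapping sentence_id to system output tokens
    """
    matched = defaultdict(int)
    total = defaultdict(int)
    for sent_id, category, ref in checkpoints:
        m, t = checkpoint_credit(ref, hypotheses[sent_id])
        matched[category] += m
        total[category] += t
    return {c: matched[c] / total[c] for c in total if total[c]}


# Hypothetical check-points and system outputs, for illustration only.
checkpoints = [
    (0, "noun_phrase", ["the", "red", "car"]),
    (0, "ambiguous_word", ["bank"]),
    (1, "noun_phrase", ["a", "small", "house"]),
]
hypotheses = {
    0: "he parked the red car near the bank".split(),
    1: "she bought a little house".split(),
}
scores = score_by_category(checkpoints, hypotheses)
for category, score in sorted(scores.items()):
    print(f"{category}: {score:.3f}")
```

Reporting one score per category, rather than a single corpus-level number, is what lets such an evaluation point at the specific linguistic phenomena a system handles poorly.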

Similar resources

Introduction to China’s CWMT2008 Machine Translation Evaluation

This paper presents an overall introduction to the CWMT2008 evaluation and focuses on its two new metrics: BLEU-SBP (Chiang et al., 2008) and the linguistic check-point method (Zhou et al., 2008). BLEU-SBP is a revised BLEU with a strict brevity penalty. Our experiments validated BLEU-SBP’s effectiveness in resolving the non-decomposability problem of both NIST-BLEU and IBM-BLEU at the sentence level. Lingui...

Woodpecker: An Automatic Methodology for Machine Translation Diagnosis with Rich Linguistic Knowledge

Unlike “black-box” evaluation, diagnostic evaluation aims to provide better explanatory power regarding various aspects of the performance of artificial intelligence systems. However, for machine translation (MT) systems, due to their complexity and knowledge dependency, such diagnostic evaluation often demands a large amount of manual work. To tackle this problem, we propose an auto...

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are at the core of Machine Translation (MT) engines, as these are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

A Framework for Diagnostic Evaluation of MT Based on Linguistic Checkpoints

This paper describes an approach to the diagnostic evaluation of machine translation (MT) based on linguistic checkpoints, which can provide valuable information both to the developers and to the end-users of MT systems. We present a flexible framework and a new tool, DELiC4MT, for fine-grained diagnostic MT evaluation which can be extended to any language pair and applied to any evaluation tar...

DELiC4MT: A Tool for Diagnostic MT Evaluation over User-defined Linguistic Phenomena

This paper demonstrates DELiC4MT, a piece of software that allows the user to perform diagnostic evaluation of machine translation systems over linguistic checkpoints, i.e., source-language lexical elements and grammatical constructions specified by the user. Our integrated tool builds upon best practices, software components and formats developed under different projects and initiatives, focusi...

Publication date: 2008